# A tibble: 8 × 4
  Region Rshort Company Sales
  <chr>  <chr>  <chr>   <dbl>
1 North  N      Co.A       13
2 North  N      Co.B       39
3 South  S      Co.A       35
4 South  S      Co.B        6
5 East   E      Co.A       27
6 East   E      Co.B       27
7 West   W      Co.A       25
8 West   W      Co.B       28
Zelazny Plots: Alternative Plotting with GGPlot
Welcome
This tutorial lets you explore R interactively, writing and running code directly in your web browser. Above, you can see R being set up and packages being loaded right in this browser window.
This tutorial was created by Brennan Antone using WebR and Quarto. It adapts a classic exercise from: Zelazny, G. (1985). Say it with charts: the executive’s guide to visual communication. McGraw-Hill Education.
Introduction
In his classic exercise, Zelazny presented a simple data set comparing sales across four regions for two companies: Company A and Company B. He asked participants to consider different ways of plotting the data, and what the implications of these approaches might be.
We will extend this exercise by using the ggplot2 package in R to demonstrate how different forms of visualizations can be implemented, alongside the dplyr and tidyr packages for data manipulation.
Challenge
Here is the example data, as presented in his figure:
It describes the sales of two companies, Company A and Company B, in June. The values show the percentage of each company’s sales that originated from four regions: North, South, East, and West.
Readers are challenged to brainstorm different ways this could be converted to a chart, and then to consider the implications of choosing different types of charts on the message conveyed.
Data Setup
Running the code below, we can set up the data in a simple tibble.
I’ve added an additional column, Rshort, which contains the one letter abbreviations of the region names. This is helpful for adjusting whether we want short or long labels.
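A minimal sketch of that setup could look like the following. The object name data1 and the column names Co.A and Co.B are assumptions made for illustration; the values are the regional percentages used throughout this tutorial.

```r
library(tibble)

# Wide layout: one row per region, one column per company.
# `data1`, `Co.A`, and `Co.B` are assumed names, not confirmed by the tutorial.
data1 <- tibble(
  Region = c("North", "South", "East", "West"),
  Rshort = c("N", "S", "E", "W"),  # one-letter abbreviations of Region
  Co.A   = c(13, 35, 27, 25),      # Company A: percent of sales per region
  Co.B   = c(39, 6, 27, 28)        # Company B: percent of sales per region
)

data1
```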
This is a simple tibble with eight values. But is it “tidy data” based on Wickham’s definition?
Placing the data in the proper format for the type of plot you want is key when using ggplot. If we have eight “bars” we want to create, the simplest approach is to create a tibble with eight rows. Each case will then correspond to one bar. For instance, see the transformed data below:
How do we move from our original data to the new format? Using the pivot_longer command, we can make a tibble longer - meaning more rows and fewer columns.
We do this by specifying the columns we want to transform into additional rows, as well as how to rename the new columns in our tibble.
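As a sketch of that step, assuming the original tibble is called data1 and stores the two companies in columns named Co.A and Co.B (both names are assumptions), the call might look like this. The cols argument picks the columns to fold into rows, while names_to and values_to name the two new columns.

```r
library(tibble)
library(tidyr)

# Assumed wide starting point: one column per company.
data1 <- tibble(
  Region = c("North", "South", "East", "West"),
  Rshort = c("N", "S", "E", "W"),
  Co.A   = c(13, 35, 27, 25),
  Co.B   = c(39, 6, 27, 28)
)

# Fold the two company columns into rows: 4 rows become 8.
data2 <- pivot_longer(
  data1,
  cols      = c(Co.A, Co.B),  # columns to turn into rows
  names_to  = "Company",      # new column holding the old column names
  values_to = "Sales"         # new column holding the cell values
)

data2
```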
Different Plots
This same tibble can be used to generate many different possible configurations of plots. Zelazny created illustrations to show six of the many different possibilities:
Here is a simple template that sets up the basics of the plot.
We can see in this default plot there is no variability on the x-axis. Height of each segment (y-axis) corresponds to sales. The fill color corresponds to the Region, with ggplot automatically producing a legend when color is added.
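A template matching that description might look like the sketch below. Mapping x to an empty string is one assumed way to get a single column with no variation on the x-axis; data2 is rebuilt here only so the block runs on its own.

```r
library(ggplot2)
library(dplyr)

# Long-format data: one row per bar segment.
data2 <- tibble(
  Region  = rep(c("North", "South", "East", "West"), each = 2),
  Rshort  = rep(c("N", "S", "E", "W"), each = 2),
  Company = rep(c("Co.A", "Co.B"), times = 4),
  Sales   = c(13, 39, 35, 6, 27, 27, 25, 28)
)

# One stacked column: constant x, segment height from Sales, color from Region.
p0 <- ggplot(data2, aes(x = "", y = Sales, fill = Region)) +
  geom_bar(stat = "identity")

p0
```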
Now, let’s get started on recreating the actual plots!
Plot 1: Stacked Columns
To re-create this plot, we will want to use facet_wrap("x") to create multiple facets based on a variable x. We can also add theme_void(base_size = 17) or theme_bw(base_size = 17) to move to a simpler visual style closer to the illustration. The base_size will make the text bigger.
Finally, how do we add text labels on each bar segment? First, we need to add geom_text(size = 5) to place text on the plot. In the aesthetic mappings (aes()) we need to specify which variable maps onto the label (i.e. label = x). Lastly, we need to make sure the labels are positioned properly. Add the following argument to geom_text(): position = position_stack(vjust = 0.5). You can try adding or removing this, and adjusting vjust between 0 and 1, to see what it does to the position of the text.
This will look pretty good. But our regions may not be in the same order as the ones in the illustration. How can we fix it? We can adjust the ordering of a factor using the mutate command on data2 before we plot it. We can get an altered data frame with:
mutate(data2, Region = fct_relevel(Region,c("West","East","South","North")))
Try to set this up below using the pipe operator |>. With two pipes, we should be able to pass data2 into mutate() and then pass the results of the mutate into ggplot().
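Putting these pieces together, one possible version of the stacked columns is sketched below. Faceting by Company, labeling with Sales, and the choice of theme_void() are assumptions about what the illustration shows.

```r
library(ggplot2)
library(dplyr)
library(forcats)

data2 <- tibble(
  Region  = rep(c("North", "South", "East", "West"), each = 2),
  Rshort  = rep(c("N", "S", "E", "W"), each = 2),
  Company = rep(c("Co.A", "Co.B"), times = 4),
  Sales   = c(13, 39, 35, 6, 27, 27, 25, 28)
)

p1 <- data2 |>
  # First pipe: fix the stacking order of the regions.
  mutate(Region = fct_relevel(Region, c("West", "East", "South", "North"))) |>
  # Second pipe: hand the reordered data to ggplot().
  ggplot(aes(x = "", y = Sales, fill = Region, label = Sales)) +
  geom_bar(stat = "identity") +
  # Center each label inside its own segment.
  geom_text(size = 5, position = position_stack(vjust = 0.5)) +
  facet_wrap("Company") +          # one stacked column per company
  theme_void(base_size = 17)

p1
```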
Plot 2: Bar Charts
Now, building on the elements we already discussed, let’s create these bar charts. To reposition the bars, we will need to adjust the aesthetic mapping aes() of what appears on the x-axis and the y-axis.
Note that if we want one-letter names for regions, we can use the variable Rshort we created instead of the variable Region. To remove the legend from the plot, we can add theme(legend.position = "none"). To change the y-axis label, we can add ylab("y").
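A sketch of how these options could combine is given below. It assumes horizontal bars with one panel per company and a y-axis label of "Region"; these layout details are guesses rather than a reproduction of the illustration.

```r
library(ggplot2)
library(dplyr)

data2 <- tibble(
  Region  = rep(c("North", "South", "East", "West"), each = 2),
  Rshort  = rep(c("N", "S", "E", "W"), each = 2),
  Company = rep(c("Co.A", "Co.B"), times = 4),
  Sales   = c(13, 39, 35, 6, 27, 27, 25, 28)
)

# Sales on the x-axis and regions on the y-axis give horizontal bars.
p2 <- ggplot(data2, aes(x = Sales, y = Rshort, fill = Region)) +
  geom_bar(stat = "identity") +
  facet_wrap("Company") +
  theme_bw(base_size = 17) +
  theme(legend.position = "none") +  # the axis already names each region
  ylab("Region")

p2
```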
Plot 3: Mirrored Bar Charts
Mirroring the directions of the plots is a little trickier. We will need to subset the data, and then arrange it in opposite directions. Experiment with altering the code provided below to see what each line does.
Code provided below:
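One way mirrored bars could be built is sketched here; it is an illustrative approach, not necessarily the tutorial's original code. It assumes that plotting Company A with negated Sales, so its bars extend to the left, is an acceptable way to arrange the two subsets in opposite directions.

```r
library(ggplot2)
library(dplyr)

data2 <- tibble(
  Region  = rep(c("North", "South", "East", "West"), each = 2),
  Rshort  = rep(c("N", "S", "E", "W"), each = 2),
  Company = rep(c("Co.A", "Co.B"), times = 4),
  Sales   = c(13, 39, 35, 6, 27, 27, 25, 28)
)

p3 <- ggplot(mapping = aes(y = Rshort, fill = Company)) +
  # Company A: negated values, so the bars grow leftward from zero.
  geom_bar(data = filter(data2, Company == "Co.A"),
           aes(x = -Sales), stat = "identity") +
  # Company B: unchanged values, so the bars grow rightward from zero.
  geom_bar(data = filter(data2, Company == "Co.B"),
           aes(x = Sales), stat = "identity") +
  # Show axis labels as positive numbers on both sides.
  scale_x_continuous(labels = abs) +
  labs(x = "Sales", y = "Region") +
  theme_bw(base_size = 17)

p3
```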
Plot 4: Different Ordered Bars
Now, let’s try to apply what we learned from Plot 3 to create Plot 4.
To get a separate order in each panel, we will first have to use facet_wrap("Company", scales = "free_y") to allow the y-axis to differ for each company. We will then need to add two separate geom_bar() functions, each operating on a different subset of the data. Within each subset, we will use fct_reorder() to reorder Rshort based on Sales.
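The recipe above might be sketched as follows. The horizontal orientation and the axis label are assumptions; the key idea is that each layer reorders Rshort using only its own company's Sales.

```r
library(ggplot2)
library(dplyr)
library(forcats)

data2 <- tibble(
  Region  = rep(c("North", "South", "East", "West"), each = 2),
  Rshort  = rep(c("N", "S", "E", "W"), each = 2),
  Company = rep(c("Co.A", "Co.B"), times = 4),
  Sales   = c(13, 39, 35, 6, 27, 27, 25, 28)
)

p4 <- ggplot(data2, aes(x = Sales)) +
  # Layer for Company A, ordered by Company A's sales only.
  geom_bar(data = filter(data2, Company == "Co.A"),
           aes(y = fct_reorder(Rshort, Sales)), stat = "identity") +
  # Layer for Company B, ordered by Company B's sales only.
  geom_bar(data = filter(data2, Company == "Co.B"),
           aes(y = fct_reorder(Rshort, Sales)), stat = "identity") +
  # free_y lets each panel keep its own ordering of regions.
  facet_wrap("Company", scales = "free_y") +
  ylab("Region") +
  theme_bw(base_size = 17)

p4
```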
Plot 5: Clustered Bar Charts
This next one is a little easier: there is no need to subset the data. Just mutate() data2 to adjust the order of Rshort. No facets are needed. We will then adjust the position of the bars with: geom_bar(stat = "identity", width = 0.5, position = position_dodge(width = 0.5))
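A minimal sketch of the clustered version follows. The region order N, S, E, W, the vertical orientation, and filling by Company are assumptions chosen for illustration.

```r
library(ggplot2)
library(dplyr)
library(forcats)

data2 <- tibble(
  Region  = rep(c("North", "South", "East", "West"), each = 2),
  Rshort  = rep(c("N", "S", "E", "W"), each = 2),
  Company = rep(c("Co.A", "Co.B"), times = 4),
  Sales   = c(13, 39, 35, 6, 27, 27, 25, 28)
)

p5 <- data2 |>
  # Assumed order of the clusters along the axis.
  mutate(Rshort = fct_relevel(Rshort, c("N", "S", "E", "W"))) |>
  ggplot(aes(x = Rshort, y = Sales, fill = Company)) +
  # Dodging places the two companies side by side within each region.
  geom_bar(stat = "identity", width = 0.5,
           position = position_dodge(width = 0.5)) +
  xlab("Region") +
  theme_bw(base_size = 17)

p5
```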
Plot 6: Pie Charts
For this final plot, adding coord_polar("y", start = 0) is a quick way to move from rectangular coordinates (e.g., bars) to polar coordinates (e.g., pie charts). You will need to mutate the data to get the pie chart segments to match the order shown above.
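As a sketch, the pie charts can start from the same stacked column used earlier and simply switch coordinate systems. The segment order and the one-pie-per-company faceting are assumptions about the illustration.

```r
library(ggplot2)
library(dplyr)
library(forcats)

data2 <- tibble(
  Region  = rep(c("North", "South", "East", "West"), each = 2),
  Rshort  = rep(c("N", "S", "E", "W"), each = 2),
  Company = rep(c("Co.A", "Co.B"), times = 4),
  Sales   = c(13, 39, 35, 6, 27, 27, 25, 28)
)

p6 <- data2 |>
  # Assumed segment order; adjust to match the illustration.
  mutate(Region = fct_relevel(Region, c("West", "East", "South", "North"))) |>
  ggplot(aes(x = "", y = Sales, fill = Region)) +
  geom_bar(stat = "identity", width = 1) +
  # Bend the stacked column around a circle: heights become angles.
  coord_polar("y", start = 0) +
  facet_wrap("Company") +
  theme_void(base_size = 17)

p6
```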
Takeaways
This tutorial has demonstrated that the same data can generate different types of visualizations.
The ggplot2 package, by using the Grammar of Graphics approach, provides you with flexibility to create different plots by combining different types of elements.